Polynomial data processing operation

ABSTRACT

A data processing system 2 includes an instruction decoder 22 responsive to polynomial divide instructions DIVL.P_N to generate control signals that control processing circuitry 26 to perform a polynomial division operation. The denominator polynomial is represented by a denominator value stored within a register, with an assumption that the highest degree term of the polynomial always has a coefficient of “1”. This coefficient therefore need not be stored within the register storing the denominator value, and accordingly the denominator polynomial may have a degree one higher than would be possible with the bit space within the register storing the denominator value alone. The polynomial divide instruction returns a quotient value and a remainder value respectively representing the quotient polynomial and the remainder polynomial.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to the field of data processing systems. More particularly, this invention relates to data processing systems providing support for polynomial data processing operations.

2. Description of the Prior Art

It is known within data processing systems to provide some support for polynomial arithmetic. For example, it is known to provide support for polynomial arithmetic associated with Reed Solomon coding or Elliptic Curve Cryptography. One known data processing system providing such support is the digital signal processor produced by Texas Instruments as TMS320C64x. These digital signal processors provide an instruction to perform the operation:

a = b*c mod p (where b is 32 bits and c is 9 bits),

where p is held in a special 32-bit register (GPLYA or GPLYB).

This known form of polynomial instruction yields the remainder portion of a polynomial multiplication, providing good support for Reed Solomon coding. It is not suited to other forms of polynomial data processing, such as that associated with signal scrambling or the calculation of transmission codes.

It is also known to provide special purpose hardware for the purpose of signal scrambling or generating transmission codes. Such special purpose hardware can be provided in a form capable of performing the necessary calculations at high speed, but has the disadvantage of consuming significant circuit resource for this dedicated function as well as being relatively inflexible and ill-suited to reuse and/or modification.

SUMMARY OF THE INVENTION

Viewed from one aspect the present invention provides apparatus for processing data comprising:

an instruction decoder responsive to a program instruction to generate one or more control signals;

a register bank having a plurality of registers; and

processing circuitry coupled to said instruction decoder and said register bank and responsive to said one or more control signals to perform a data processing operation corresponding to said program instruction upon one or more data values stored within said register bank; wherein

said instruction decoder is responsive to a polynomial divide instruction to generate one or more control signals that control said processing circuitry to generate at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_(i)x^(i) for N≧i≧0, where c_((N-1)) to c₀ are respective bits stored in a register of said register bank and c_(N)=1 and is not stored within said register.

The present technique provides a programmable data processing apparatus having general purpose elements such as an instruction decoder, a register bank and processing circuitry with the capability of additionally providing a polynomial divide instruction which at least generates a quotient value representing a quotient polynomial resulting from a polynomial division. Furthermore, the denominator polynomial is stored within a register of the register bank in a form in which the coefficient of the highest degree term of the polynomial is fixed at “1” and is assumed rather than requiring to be stored within a register. This permits the denominator polynomial to have a degree one higher than the bit-width being used to store the denominator value, thereby permitting more effective use of the bit space within the registers of the register bank to represent the results of the polynomial divide so as to more readily match the maximum bit-width of possible results.

The register storing the coefficients of the denominator polynomial can be an N-bit register.
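As an illustration of this implicit leading coefficient, the following C sketch converts between a full coefficient bitmap and the value that would be held in the N-bit register. The function names `denominator_to_register` and `register_to_denominator`, and the use of a 64-bit integer to hold the full polynomial, are assumptions made for this example and are not taken from the embodiments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bit i of full_poly is the coefficient c_i.
 * Checks that the polynomial has degree exactly n (c_n = 1, nothing above)
 * and returns c_(n-1)..c_0, i.e. the bits that would actually be stored. */
uint32_t denominator_to_register(uint64_t full_poly, unsigned n)
{
    assert(n >= 1 && n <= 32);
    assert((full_poly >> n) == 1u);   /* c_n must be 1, with no higher terms */
    return (uint32_t)(full_poly & ((UINT64_C(1) << n) - 1u));
}

/* Hypothetical inverse: re-attach the assumed x^n term to a stored value. */
uint64_t register_to_denominator(uint32_t reg_value, unsigned n)
{
    assert(n >= 1 && n <= 32);
    uint64_t low_mask = (UINT64_C(1) << n) - 1u;
    return (UINT64_C(1) << n) | (reg_value & low_mask);
}
```

For instance, the degree-32 generator polynomial commonly used for CRC-32, conventionally written as 0x04C11DB7 with its x^32 term omitted, fits a 32-bit register in this form.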

Whilst the polynomials being manipulated can be represented by values stored within the registers in a variety of different ways, it is convenient to represent them by storing the coefficients for the different terms at respective bit positions of values stored within a register.

The coefficients can be stored in different orders within the values held within the register, such as with the lowest degree term having its coefficient stored at the most significant bit position progressing to the highest degree term with its coefficient stored at the least significant bit position, or the opposite way around (e.g. similar to little endian or big endian storage).

The numerator polynomial will often be of a higher degree than the denominator polynomial and accordingly convenient embodiments represent the numerator polynomial by a 2N-bit numerator value stored within either two N-bit registers or within a 2N-bit register within the register bank when such wider registers (e.g. accumulator registers) are provided within the register bank.

The polynomial division instruction may also generate a remainder value representing a remainder polynomial resulting from the polynomial division as well as the quotient value representing the quotient polynomial. While the quotient polynomial is useful in generating scrambled signals, transmission codes and the like, the remainder polynomial is also useful in other circumstances and accordingly it is convenient if both are generated from the polynomial division instruction.

The remainder value and the quotient value may be conveniently stored within respective N-bit registers of the register bank.

The efficiency of the implementation of this technique is improved when the register bank used comprises a plurality of general purpose scalar registers used by program instructions other than the polynomial divide instruction.

The general applicability of the data processing system incorporating the polynomial divide instruction and the ability to reuse this system for a variety of functions is enhanced when it additionally provides a polynomial multiply instruction in combination with the above described polynomial divide instruction.

Whilst the polynomial divide instruction may often be required in a scalar form, it is also possible that in some embodiments it is desirable to provide the polynomial divide instruction as a vector instruction with the denominator value being a scalar value, as the denominator value will typically change infrequently and need to be applied to long vector sequences of numerator values to generate vector sequences of quotient values.

Viewed from another aspect the present invention provides a method of processing data comprising the steps of:

decoding a program instruction to generate one or more control signals;

in response to said one or more control signals, performing a data processing operation corresponding to said program instruction upon one or more data values stored within a register bank having a plurality of registers; wherein

said decoding is responsive to a polynomial divide instruction to generate one or more control signals that control generation of at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_(i)x^(i) for N≧i≧0, where c_((N-1)) to c₀ are respective bits stored in a register of said register bank and c_(N)=1 and is not stored within said register.

It will be appreciated that a further aspect of the invention is the provision of computer programs which incorporate the polynomial divide instruction discussed above for controlling hardware in accordance with the present technique.

The above, and other objects, features and advantages of this invention will be apparent from the following detailed description of illustrative embodiments which is to be read in connection with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 schematically illustrates a data processing apparatus including support for a polynomial division instruction;

FIG. 2 schematically illustrates the syntax of a polynomial divide instruction;

FIG. 3 schematically illustrates an example polynomial division operation;

FIG. 4 schematically illustrates a circuit for performing a polynomial division operation in response to a polynomial division instruction;

FIG. 5 illustrates an example syntax for a vector polynomial divide instruction;

FIG. 6 illustrates two alternative ways in which a polynomial may be represented by a value stored within a register; and

FIG. 7 schematically illustrates the syntax of a polynomial multiply instruction.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 illustrates a data processing apparatus 2 in the form of a processor 4 coupled to a memory 6. The memory 6 stores program instructions 8, such as program instructions for performing signal scrambling, as well as data values to be subject to processing operations, such as data values 10 forming a stream to be scrambled and transmitted.

The processor 4 includes a register bank 12 formed of N-bit registers (e.g. 32-bit registers) as well as (optionally) some 2N-bit registers 14 which are provided for use as accumulator registers in association with multiply accumulate instructions. Processing elements including a multiplier 16, a shifter 18 and an adder 20 perform processing operations under control of control signals generated by an instruction decoder 22 in response to program instructions progressing along an instruction pipeline 24 when fetched from the memory 6. The processor 4 is a general purpose processor with a scalar register bank 12, 14 for performing general purpose data processing operations, such as normal logic and arithmetic operations, in response to program instructions fetched from the memory 6. The control signals generated by the instruction decoder 22 configure the data processing elements 16, 18, 20 to perform the desired data processing operations.

Additionally provided within the processor 4 is polynomial division circuitry 26 which is responsive to control signals generated by the instruction decoder 22 to perform polynomial division operations upon data values retrieved from the memory 6 (via the registers 12, 14). These polynomial division operations and the polynomial divide instruction will be described further below.

FIG. 2 schematically illustrates the syntax of a polynomial divide instruction DIVL.Pn. In the syntax DIVL.Pn (and divl_pn in the code discussed later) “n” is the width of the operation (e.g. 8, 16 or 32) and may be less than the width of the register “N”. In the following examples it is assumed that N=n, but it will be appreciated that the present technique is also applicable when N≠n (e.g. N=32 and n=8, 16 or 32). The polynomial divide instruction uses three registers to hold its input operands. These registers are N-bit registers within the register bank 12 in this example. The numerator value, which represents a numerator polynomial, is stored within registers r0 and r1. The numerator value is accordingly a 2N-bit value. The denominator value, which represents the denominator polynomial, is stored within register r2. The denominator value represents the denominator polynomial with the assumption that the denominator polynomial starts with its highest degree term having a coefficient of “1”, and accordingly the denominator value need only represent the coefficients of the terms following this highest degree term. This permits the denominator polynomial to include a maximum number of terms one greater than the width of the register r2. This is an advantage since the resulting remainder value from such a polynomial divide will have a bit length one less than that of the full denominator polynomial and accordingly will naturally fit the register size without wasted register bit space when the register r2 and the register in which the remainder value is stored (r4) have the same width. Thus, in the case of 32-bit registers the denominator polynomial can have 33 terms and the remainder polynomial can have 32 terms.

As will be seen in FIG. 2, the polynomial divide instruction DIVL.Pn returns a quotient value representing the quotient polynomial into register r3 and a remainder value representing a remainder polynomial into register r4. In the syntax illustrated it will be seen that the denominator polynomial is extended with a highest degree term x^(N) in addition to the terms with coefficients specified by the bits of the denominator value stored within register r2. The numerator polynomial takes as its high degree portion terms specified by the coefficients stored within register r1 (indicated as being boosted to the high degree portion by multiplication by x^(N)) with the low degree portion of the numerator polynomial being formed with the terms having coefficients taken from the register r0.
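The relationship between the five registers can be summarised by the following identity over the field of two elements. The symbols n(x), d(x), q(x) and r(x) are labels introduced here for the numerator, denominator, quotient and remainder polynomials, and r0(x) to r4(x) denote the polynomials whose coefficients are the bits of the correspondingly named registers; this is a restatement under those assumptions rather than notation taken from FIG. 2.

```latex
\begin{aligned}
n(x) &= \mathrm{r1}(x)\,x^{N} + \mathrm{r0}(x), \\
d(x) &= x^{N} + \mathrm{r2}(x), \\
n(x) &= q(x)\,d(x) + r(x), \qquad \deg r(x) < N, \\
\mathrm{r3}(x) &= q(x), \qquad \mathrm{r4}(x) = r(x).
\end{aligned}
```

Since n(x) has degree at most 2N-1 and d(x) has degree exactly N, q(x) has degree at most N-1, so both results fit within N bits.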

It will be seen that the degree of the polynomials being manipulated is represented in this general syntax by the variable N. It will be appreciated that this can take a variety of different values and the polynomials being manipulated can be, for example, degree 8, 16 or 32 depending upon the data processing requirements. Other values for N are also possible.

One way of viewing the polynomial divide instruction in the case of N=32 is that it gives a result equivalent to the following C program code:

    poly32_t q=x0, r=x1, p=x2;
    int C,i;
    for (i=0; i<32; i++)
    {
        C = r>>31;
        r = r<<1;
        if (C)
        {
            r = r^p;
        }
        if (q>>31)
        {
            r = r^1;
        }
        q = (q<<1) | C;
    }
    *x3 = q;    /* ((x1<<32)+x0) div ((1<<32)+x2) */
    return r;   /* ((x1<<32)+x0) mod ((1<<32)+x2) */

FIG. 3 schematically illustrates an example polynomial division operation. In this case N=4, resulting in a denominator polynomial with up to five terms and a numerator polynomial with up to eight terms. In the example illustrated the coefficients for the denominator polynomial terms x³ and x are both zero. Accordingly, the denominator polynomial is x⁴+x²+1. The denominator value stored within register r2 is “0101” and this is extended at its most significant end by a value of “1” to give the coefficients for the denominator polynomial, as it is assumed that the highest degree term within the denominator polynomial always has a coefficient of “1”. It will be appreciated by those in this technical field that this assumption may require alignment of the denominator polynomial with an associated alignment in any numerator polynomial in order that the assumption is correct. However, denominator polynomials are often quasi-fixed values and accordingly such an alignment of the denominator polynomial will normally not be required for every calculation. Furthermore, the numerator polynomial is often a data stream and accordingly alignment within such a data stream is normally achieved by picking an appropriate starting point.
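The effect of such an alignment can be seen from a short identity. The symbols n(x), d(x), q(x), r(x) and the shift amount k are labels introduced here for illustration, with k taken to be the number of positions by which a denominator of degree N-k is shifted so that its leading term becomes x^N.

```latex
n(x) = q(x)\,d(x) + r(x), \quad \deg r < \deg d
\;\;\Longrightarrow\;\;
x^{k}\,n(x) = q(x)\,\bigl(x^{k}\,d(x)\bigr) + x^{k}\,r(x), \quad \deg\bigl(x^{k} r\bigr) < \deg\bigl(x^{k} d\bigr)
```

Under this reading, aligning both operands by the same amount leaves the quotient unchanged, and the remainder of the unaligned division is the returned remainder shifted back by k positions.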

The action of the polynomial divide instruction is similar to a long division operation. In polynomial arithmetic over a field of two elements (i.e. the coefficient of the terms can be either “0” or “1”) addition and subtraction are equivalent to an exclusive-OR function. Multiplication is equivalent to an AND operation. These operations are performed in respect of the terms of the same degree. In the example illustrated the quotient polynomial resulting from the polynomial divide instruction is “x³+x²+x” and this is represented by a quotient value of “1110”. The remainder polynomial is “1” and this is represented by a remainder value of “0001”.
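The long division just described can be sketched in C for an arbitrary operation width. The function name `poly_divide_n` and its parameter layout are hypothetical and chosen only for this illustration; the loop follows the same shift and exclusive-OR steps as the 32-bit C model given in the description, generalised to a width n between 1 and 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical width-n model: divides (hi * x^n + lo) by (x^n + p) over the
 * field of two elements. Stores the quotient in *quot, returns the remainder. */
uint32_t poly_divide_n(unsigned n, uint32_t hi, uint32_t lo, uint32_t p,
                       uint32_t *quot)
{
    assert(n >= 1 && n <= 32);
    assert(quot != 0);
    const uint32_t mask = (n == 32) ? 0xFFFFFFFFu : ((1u << n) - 1u);
    uint32_t r = hi & mask;   /* running remainder, starts as the high half */
    uint32_t q = lo & mask;   /* low half, gradually replaced by quotient bits */
    p &= mask;

    for (unsigned i = 0; i < n; i++) {
        uint32_t c  = (r >> (n - 1)) & 1u;  /* coefficient about to reach x^n */
        uint32_t in = (q >> (n - 1)) & 1u;  /* next numerator coefficient */
        r = (r << 1) & mask;
        if (c)
            r ^= p;           /* subtract the denominator (XOR over GF(2)) */
        r ^= in;              /* bring down the next numerator coefficient */
        q = ((q << 1) & mask) | c;          /* record the quotient bit */
    }
    *quot = q;
    return r;
}
```

The numerator of FIG. 3 is not stated in the text, but multiplying the quoted quotient by the denominator and adding the remainder gives x⁷+x⁶+x⁴+x²+x+1, i.e. “1101” in the high half and “0111” in the low half. Calling `poly_divide_n(4, 0xD, 0x7, 0x5, &q)` with those derived values reproduces the quotient “1110” and remainder “0001” quoted above.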

FIG. 4 schematically illustrates circuitry for implementing a polynomial divide instruction of degree 4, such as may be provided by the polynomial division circuitry 26 of FIG. 1. As will be seen, the circuitry for performing the polynomial divide operation in response to the polynomial divide instruction comprises an arrangement of AND gates and XOR gates. The inputs to these gates are the numerator value [n₇:n₀] and the denominator value [1:p₀].

The circuitry in FIG. 4 is controlled by control signals generated by the instruction decoder 22 in response to a polynomial divide instruction in that the circuitry is activated for use and the numerator value and denominator value are read out from the register bank 12 and supplied to the inputs in the circuitry of FIG. 4. Furthermore, the quotient value [q₃:q₀] and the remainder value [r₃:r₀] are similarly read from the outputs of the circuitry of FIG. 4 and stored back into registers within the register bank 12. In this example, two write ports are provided to the register bank 12 in order to allow both the quotient value and the remainder value to be written therein.

It will be appreciated that the circuit of FIG. 4 is for calculating polynomial divides with a degree of four. However, regular extension of this circuitry provides polynomial divide instruction operation for polynomial divides of a higher degree and in practice the same circuitry can be reused for these divides of differing degree with appropriate multiplexing of signal values, as will be familiar to those in this technical field.

Given below is an example of register transfer language (RTL) defining circuitry for performing a polynomial divide of degree 32, 16 or 8.

    // Polynomial divider
    //-----------------------------------------------------------------------------

    //--------------------------------------------------------
    // Divider Ex stage
    //--------------------------------------------------------
    always @ (posedge clk)
     if (cmd_div_en) begin
      d_din_h_ex <= din_a_sz;
      d_poly_ex <= din_b_sh;
     end
    // Common terms
    assign d_i = d_din_h_ex;
    assign d_p = d_poly_ex;
    assign d_p_29 = d_p[30] ^ d_p[31];
    assign d_p_28 = d_p[29] ^ d_p[31];
    assign d_p_27 = d_p[28] ^ (d_p[30] | d_p[31]);
    assign d_p_26 = d_p[27] ^ (d_p[31] & ~(d_p[29] ^ d_p[30]));
    assign d_p_25 = d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30]);

    // Divider - 1st 8-bits
    assign d_o[31] = d_i[31];
    assign d_o[30] = d_i[30] ^
         (d_i[31] & d_p[31]);
    assign d_o[29] = d_i[29] ^
         (d_i[30] & d_p[31]) ^
         (d_i[31] & d_p_29);
    assign d_o[28] = d_i[28] ^
         (d_i[31] & d_p_28) ^
         (d_i[30] & d_p_29) ^
         (d_i[29] & d_p[31]);
    assign d_o[27] = d_i[27] ^
         (d_i[28] & d_p[31]) ^
         (d_i[29] & d_p_29) ^
         (d_i[30] & d_p_28) ^
         (d_i[31] & d_p_27);
    assign d_o[26] = d_i[26] ^
         (d_i[27] & d_p[31]) ^
         (d_i[28] & d_p_29) ^
         (d_i[29] & d_p_28) ^
         (d_i[30] & d_p_27) ^
         (d_i[31] & d_p_26);
    assign d_o[25] = d_i[25] ^
         (d_i[26] & d_p[31]) ^
         (d_i[27] & d_p_29) ^
         (d_i[28] & d_p_28) ^
         (d_i[29] & d_p_27) ^
         (d_i[30] & d_p_26) ^
         (d_i[31] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30])));
    assign d_o[24] = d_i[24] ^
         (d_i[25] & d_p[31]) ^
         (d_i[26] & d_p_29) ^
         (d_i[27] & d_p_28) ^
         (d_i[28] & d_p_27) ^
         (d_i[29] & d_p_26) ^
         (d_i[30] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30]))) ^
         (d_i[31] & d_p[25]) ^
         (d_i[31] & d_p[31] & d_p[27]) ^
         (d_i[31] & d_p[30] & d_p[29]) ^
         (d_i[31] & d_p[31]);
    assign dpp_31 = { { 1{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[31]}} & d_poly_ex), {31{1'b0}} };
    assign dpp_30 = { { 2{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[30]}} & d_poly_ex), {30{1'b0}} };
    assign dpp_29 = { { 3{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[29]}} & d_poly_ex), {29{1'b0}} };
    assign dpp_28 = { { 4{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[28]}} & d_poly_ex), {28{1'b0}} };
    assign dpp_27 = { { 5{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[27]}} & d_poly_ex), {27{1'b0}} };
    assign dpp_26 = { { 6{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[26]}} & d_poly_ex), {26{1'b0}} };
    assign dpp_25 = { { 7{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[25]}} & d_poly_ex), {25{1'b0}} };
    assign dpp_24 = { { 8{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[24]}} & d_poly_ex), {24{1'b0}} };
    assign dp_24 = (({d_i, {32{1'b0}}} ^ dpp_31) ^ dpp_28) ^ ((dpp_30 ^ dpp_29) ^ dpp_27) ^ (dpp_26 ^ dpp_25) ^ dpp_24;

    // Divider - 2nd 8-bits
    assign d_o[23] = dp_24[23+32];
    assign d_o[22] = dp_24[22+32] ^
         (dp_24[23+32] & d_p[31]);
    assign d_o[21] = dp_24[21+32] ^
         (dp_24[22+32] & d_p[31]) ^
         (dp_24[23+32] & d_p_29);
    assign d_o[20] = dp_24[20+32] ^
         (dp_24[23+32] & d_p_28) ^
         (dp_24[22+32] & d_p_29) ^
         (dp_24[21+32] & d_p[31]);
    assign d_o[19] = dp_24[19+32] ^
         (dp_24[20+32] & d_p[31]) ^
         (dp_24[21+32] & d_p_29) ^
         (dp_24[22+32] & d_p_28) ^
         (dp_24[23+32] & d_p_27);
    assign d_o[18] = dp_24[18+32] ^
         (dp_24[19+32] & d_p[31]) ^
         (dp_24[20+32] & d_p_29) ^
         (dp_24[21+32] & d_p_28) ^
         (dp_24[22+32] & d_p_27) ^
         (dp_24[23+32] & d_p_26);
    assign d_o[17] = dp_24[17+32] ^
         (dp_24[18+32] & d_p[31]) ^
         (dp_24[19+32] & d_p_29) ^
         (dp_24[20+32] & d_p_28) ^
         (dp_24[21+32] & d_p_27) ^
         (dp_24[22+32] & d_p_26) ^
         (dp_24[23+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30])));
    assign d_o[16] = dp_24[16+32] ^
         (dp_24[17+32] & d_p[31]) ^
         (dp_24[18+32] & d_p_29) ^
         (dp_24[19+32] & d_p_28) ^
         (dp_24[20+32] & d_p_27) ^
         (dp_24[21+32] & d_p_26) ^
         (dp_24[22+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30]))) ^
         (dp_24[23+32] & d_p[25]) ^
         (dp_24[23+32] & d_p[31] & d_p[27]) ^
         (dp_24[23+32] & d_p[30] & d_p[29]) ^
         (dp_24[23+32] & d_p[31]);
    assign dpp_23 = { { 9{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[23]}} & d_poly_ex), {23{1'b0}} };
    assign dpp_22 = { {10{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[22]}} & d_poly_ex), {22{1'b0}} };
    assign dpp_21 = { {11{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[21]}} & d_poly_ex), {21{1'b0}} };
    assign dpp_20 = { {12{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[20]}} & d_poly_ex), {20{1'b0}} };
    assign dpp_19 = { {13{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[19]}} & d_poly_ex), {19{1'b0}} };
    assign dpp_18 = { {14{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[18]}} & d_poly_ex), {18{1'b0}} };
    assign dpp_17 = { {15{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[17]}} & d_poly_ex), {17{1'b0}} };
    assign dpp_16 = { {16{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[16]}} & d_poly_ex), {16{1'b0}} };
    assign dp_16 = ((dp_24 ^ dpp_23) ^ dpp_20) ^ ((dpp_22 ^ dpp_21) ^ dpp_19) ^ (dpp_18 ^ dpp_17) ^ dpp_16;

    //--------------------------------------------------------
    // Divider Ex2 stage
    //--------------------------------------------------------
    // Note that d_poly_ex is re-used in Ex2 (it must not change!)
    // REVISIT, merge Ex and Ex2 stages to reduce area

    always @ (posedge clk)
     if (cmd_div_ex) begin
      d_din_l_ex2 <= din_b_sh;
      dp_ex2  <= {d_o[31:16], dp_16[47:0]};
     end

    // Divider - 1st 8-bits
    assign d_o[15] = dp_ex2[15+32];
    assign d_o[14] = dp_ex2[14+32] ^
         (dp_ex2[15+32] & d_p[31]);
    assign d_o[13] = dp_ex2[13+32] ^
         (dp_ex2[14+32] & d_p[31]) ^
         (dp_ex2[15+32] & d_p_29);
    assign d_o[12] = dp_ex2[12+32] ^
         (dp_ex2[13+32] & d_p[31]) ^
         (dp_ex2[14+32] & d_p_29) ^
         (dp_ex2[15+32] & d_p_28);
    assign d_o[11] = dp_ex2[11+32] ^
         (dp_ex2[12+32] & d_p[31]) ^
         (dp_ex2[13+32] & d_p_29) ^
         (dp_ex2[14+32] & d_p_28) ^
         (dp_ex2[15+32] & d_p_27);
    assign d_o[10] = dp_ex2[10+32] ^
         (dp_ex2[11+32] & d_p[31]) ^
         (dp_ex2[12+32] & (d_p[31] ^ d_p[30])) ^
         (dp_ex2[13+32] & d_p_28) ^
         (dp_ex2[14+32] & d_p_27) ^
         (dp_ex2[15+32] & d_p_26);
    assign d_o[9] = dp_ex2[9+32] ^
         (dp_ex2[10+32] & d_p[31]) ^
         (dp_ex2[11+32] & (d_p[31] ^ d_p[30])) ^
         (dp_ex2[12+32] & d_p_28) ^
         (dp_ex2[13+32] & d_p_27) ^
         (dp_ex2[14+32] & d_p_26) ^
         (dp_ex2[15+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30])));
    assign d_o[8] = dp_ex2[8+32] ^
         (dp_ex2[ 9+32] & d_p[31]) ^
         (dp_ex2[10+32] & (d_p[31] ^ d_p[30])) ^
         (dp_ex2[11+32] & d_p_28) ^
         (dp_ex2[12+32] & d_p_27) ^
         (dp_ex2[13+32] & d_p_26) ^
         (dp_ex2[14+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30]))) ^
         (dp_ex2[15+32] & d_p[25]) ^
         (dp_ex2[15+32] & d_p[31] & d_p[27]) ^
         (dp_ex2[15+32] & d_p[30] & d_p[29]) ^
         (dp_ex2[15+32] & d_p[31]);
    assign dpp_15 = { {17{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[15]}} & d_poly_ex), {15{1'b0}} };
    assign dpp_14 = { {18{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[14]}} & d_poly_ex), {14{1'b0}} };
    assign dpp_13 = { {19{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[13]}} & d_poly_ex), {13{1'b0}} };
    assign dpp_12 = { {20{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[12]}} & d_poly_ex), {12{1'b0}} };
    assign dpp_11 = { {21{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[11]}} & d_poly_ex), {11{1'b0}} };
    assign dpp_10 = { {22{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[10]}} & d_poly_ex), {10{1'b0}} };
    assign dpp_9 = { {23{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[9]}} & d_poly_ex), { 9{1'b0}} };
    assign dpp_8 = { {24{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[8]}} & d_poly_ex), { 8{1'b0}} };
    assign dp_8 = ({{32{1'b0}}, d_din_l_ex2} ^ dp_ex2) ^ ((dpp_15 ^ dpp_14) ^ dpp_11) ^ ((dpp_13 ^ dpp_12) ^ dpp_10) ^ (dpp_9 ^ dpp_8);

    // Divider - 2nd 8-bits
    assign d_o[7] = dp_8[7+32];
    assign d_o[6] = dp_8[6+32] ^
         (dp_8[7+32] & d_p[31]);
    assign d_o[5] = dp_8[5+32] ^
         (dp_8[6+32] & d_p[31]) ^
         (dp_8[7+32] & d_p_29);
    assign d_o[4] = dp_8[4+32] ^
         (dp_8[7+32] & d_p_28) ^
         (dp_8[6+32] & d_p_29) ^
         (dp_8[5+32] & d_p[31]);
    assign d_o[3] = dp_8[3+32] ^
         (dp_8[4+32] & d_p[31]) ^
         (dp_8[5+32] & d_p_29) ^
         (dp_8[6+32] & d_p_28) ^
         (dp_8[7+32] & d_p_27);
    assign d_o[2] = dp_8[2+32] ^
         (dp_8[3+32] & d_p[31]) ^
         (dp_8[4+32] & (d_p[31] ^ d_p[30])) ^
         (dp_8[5+32] & d_p_28) ^
         (dp_8[6+32] & d_p_27) ^
         (dp_8[7+32] & d_p_26);
    assign d_o[1] = dp_8[1+32] ^
         (dp_8[2+32] & d_p[31]) ^
         (dp_8[3+32] & (d_p[31] ^ d_p[30])) ^
         (dp_8[4+32] & d_p_28) ^
         (dp_8[5+32] & d_p_27) ^
         (dp_8[6+32] & d_p_26) ^
         (dp_8[7+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30])));
    assign d_o[0] = dp_8[0+32] ^
         (dp_8[1+32] & d_p[31]) ^
         (dp_8[2+32] & (d_p[31] ^ d_p[30])) ^
         (dp_8[3+32] & d_p_28) ^
         (dp_8[4+32] & d_p_27) ^
         (dp_8[5+32] & d_p_26) ^
         (dp_8[6+32] & (d_p[26] ^ d_p[29] ^ (d_p[31] & d_p[28]) ^ (d_p[31] | d_p[30]))) ^
         (dp_8[7+32] & d_p[25]) ^
         (dp_8[7+32] & d_p[31] & d_p[27]) ^
         (dp_8[7+32] & d_p[30] & d_p[29]) ^
         (dp_8[7+32] & d_p[31]);
    assign dpp_7 = { {25{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[7]}} & d_poly_ex), {7{1'b0}} };
    assign dpp_6 = { {26{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[6]}} & d_poly_ex), {6{1'b0}} };
    assign dpp_5 = { {27{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[5]}} & d_poly_ex), {5{1'b0}} };
    assign dpp_4 = { {28{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[4]}} & d_poly_ex), {4{1'b0}} };
    assign dpp_3 = { {29{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[3]}} & d_poly_ex), {3{1'b0}} };
    assign dpp_2 = { {30{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[2]}} & d_poly_ex), {2{1'b0}} };
    assign dpp_1 = { {31{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[1]}} & d_poly_ex), {1{1'b0}} };
    assign dpp_0 = { {32{1'b0}}, ({`AR1DPU_INT_WIDTH{d_o[0]}} & d_poly_ex) };
    assign dp_0 = ((dp_8 ^ dpp_7) ^ dpp_4) ^ ((dpp_6 ^ dpp_5) ^ dpp_3) ^ (dpp_2 ^ dpp_1) ^ dpp_0;

    //------------------------------------
    // Divider Wr stage
    //------------------------------------
    always @ (posedge clk)
     if (cmd_div_ex2) begin
      dout_wr <= {dp_ex2[63:48], d_o[15:0], dp_0[31:0]};
     end

    assign d_rout_wr = cmd_size_wr[1] ? dout_wr[31:0] : (
          cmd_size_wr[0] ? {{16{1'b0}}, dout_wr[31:16]} :
               {{24{1'b0}}, dout_wr[31:24]});

    //-----------------------------------------------------------------------------
    // Output
    //-----------------------------------------------------------------------------

    assign {dout, rout} = cmd_div_wr ? {dout_wr[63:32], d_rout_wr} : m_out;

FIG. 5 illustrates a syntax that may be used for a vector polynomial divide instruction. This syntax is similar to that illustrated in FIG. 2, except that the registers storing the numerator value and the resulting quotient value and remainder value are replaced by vector registers. The denominator remains a scalar value stored in a scalar register, as it will be appreciated that the denominator polynomial and denominator value will often be constant for a long sequence of numerator values. This is the type of behaviour which is associated with a scrambler program for processing in parallel multiple streams of signal data to be transmitted. The quotient polynomial and the quotient value form the data to be transmitted, with characteristics more suitable for transmission than the raw numerator value (e.g. a long sequence of constant bit values within the numerator value will be turned into a more readily transmitted alternating pattern of bit values within the quotient value). The polynomial divide instruction of the present technique, which generates a quotient value, is well suited for use with such scrambler programs seeking to scramble a signal to be transmitted.
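The vector form can be pictured as the scalar divide applied independently to each lane, with one shared denominator. The following is a minimal sketch under stated assumptions, not the instruction's defined behaviour: the function names `divl_p8_lane` and `divl_p8_vec`, the 8-bit lane width, and the use of plain arrays in place of vector registers are all hypothetical choices made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar step: divide the 16-bit numerator (hi:lo) by x^8 + p over GF(2).
 * Writes the 8-bit quotient to *quot and returns the 8-bit remainder. */
static uint8_t divl_p8_lane(uint8_t lo, uint8_t hi, uint8_t p, uint8_t *quot)
{
    uint8_t q = lo, r = hi;
    for (int i = 0; i < 8; i++) {
        int c = r >> 7;                 /* coefficient about to leave r */
        r = (uint8_t)(r << 1);
        if (c)      r ^= p;             /* subtract (XOR) the denominator */
        if (q >> 7) r ^= 1;             /* bring down the next numerator bit */
        q = (uint8_t)((q << 1) | c);    /* record the quotient bit */
    }
    *quot = q;
    return r;
}

/* Hypothetical vector form: arrays stand in for vector registers, and the
 * single scalar denominator p is shared by every lane. */
void divl_p8_vec(const uint8_t *num_lo, const uint8_t *num_hi, uint8_t p,
                 uint8_t *quot, uint8_t *rem, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++)
        rem[i] = divl_p8_lane(num_lo[i], num_hi[i], p, &quot[i]);
}
```

Keeping the denominator as a scalar argument mirrors the observation that it is often constant over a long run of numerator values, so it need not be replicated per lane.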

FIG. 6 illustrates two ways in which the coefficients of the terms of a polynomial may be stored within registers. In particular, the coefficient of the highest degree term may be stored in either the most significant bit position or the least significant bit position within a register storing a value representing the polynomial coefficients. The other coefficients can follow in turn from this relevant selected end point. This is similar to the coefficients being stored within the registers in either big endian or little endian form.
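The two storage orders differ only by a reversal of the bit positions within the register. As a hedged illustration, the sketch below reads a coefficient under each convention and converts between them; the helper names and the choice of a 32-bit container are assumptions made for the example, not part of the described apparatus.

```c
#include <stdint.h>

/* Coefficient of x^k when bit k of the register holds c_k. */
unsigned poly_coeff_lsb0(uint32_t value, unsigned k)
{
    return (value >> k) & 1u;
}

/* Coefficient of x^k when bit (m-1)-k of an m-bit register holds c_k. */
unsigned poly_coeff_msb0(uint32_t value, unsigned m, unsigned k)
{
    return (value >> ((m - 1u) - k)) & 1u;
}

/* Convert between the two layouts by reversing the low m bits (1 <= m <= 32). */
uint32_t poly_reverse_bits(uint32_t value, unsigned m)
{
    uint32_t out = 0;
    for (unsigned k = 0; k < m; k++)
        out |= ((value >> k) & 1u) << ((m - 1u) - k);
    return out;
}
```

For instance, the 8-bit value 0x1D in the first layout corresponds to 0xB8 in the second, and applying the reversal twice returns the original value.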

The polynomial divide instruction of the present technique provides an advantageous combination with a polynomial multiply instruction, which can also be supported by the same processor 4. In this case an additional processing unit similar to polynomial divide circuitry 26 illustrated in FIG. 1 may be added in order to support a polynomial multiply instruction. FIG. 7 illustrates a syntax that may be used for such a polynomial multiply instruction. The first polynomial and the second polynomial are represented by values stored within respective registers r0 and r1. The resulting product polynomial is represented by a product value stored within register r2. The product polynomial value is double the length of the first and second polynomials and accordingly register r2 is twice the length of the registers storing the first and second values representing the first and second polynomials. In practice the register r2 can be provided by a combination of two standard length registers. Alternatively, the register bank 12 could include one or more double width registers such as the 2N-bit registers 14 illustrated in FIG. 1. Such double width registers are often provided for use with multiply accumulate instructions in standard scalar arithmetic and accordingly can be reused for this type of polynomial multiply instruction. These double width registers may also be used to store the numerator value in respect of the polynomial divide instructions previously discussed. In that case the double width register would replace registers r0 and r1 illustrated in FIG. 2 with a single double width register storing the values representing all of the coefficients of the numerator polynomial within a single register.
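The relationship between a double width product register and a pair of standard length registers can be sketched as follows. This is an illustrative model only: the names `pmull32_wide` and `pmull32_pair` are hypothetical, N is assumed to be 32, and a `uint64_t` stands in for a 2N-bit register.

```c
#include <stdint.h>

/* Carry-less (GF(2)) product of two 32-bit polynomials, held in one
 * double width value standing in for a 2N-bit register. */
uint64_t pmull32_wide(uint32_t a, uint32_t b)
{
    uint64_t acc = 0;
    for (int i = 0; i < 32; i++)
        if ((a >> i) & 1u)
            acc ^= (uint64_t)b << i;   /* add b * x^i without carries */
    return acc;
}

/* The same product delivered as two standard length halves:
 * the low half is returned and the high half is written to *high. */
uint32_t pmull32_pair(uint32_t a, uint32_t b, uint32_t *high)
{
    uint64_t wide = pmull32_wide(a, b);
    *high = (uint32_t)(wide >> 32);
    return (uint32_t)wide;
}
```

Both forms carry the same 2N coefficients; the choice between them is a matter of how the register bank is organised.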

The operation of a polynomial multiply instruction of various forms, including different width versions and a version incorporating an accumulate, is given in the following illustrative C code. This code also includes similar representations of polynomial divide instructions. The worker in this technical field will understand that these definitions of the action of these instructions can be used to generate the relevant circuitry to perform such operations in response to an associated instruction specifying that operation.

/* Assumed element types: the listing uses these names without defining
 * them. Unsigned fixed-width types are assumed so that left shifts
 * discard the bits shifted out of the top. */
#include <stdint.h>
typedef uint8_t  poly8_t;
typedef uint16_t poly16_t;
typedef uint32_t poly32_t;

/*-------------------------------------------------------------*
 * polynomial multiply long
 *-------------------------------------------------------------*/
poly8_t mull_p8(poly8_t x0, poly8_t x1, poly8_t *x2)
{
#ifdef _OPTIMODE_
#pragma OUT x2
#pragma INTRINSIC
#endif
  poly8_t q=x0, r=0;
  int C,i;
  for (i=0; i<8; i++)
  {
    C = r>>7;
    r = r<<1;
    if (q>>7)
    {
      r = r^x1;
    }
    q = (q<<1)|C;
  }
  *x2 = q;  /* (x0*x1) high 8 bits */
  return r; /* (x0*x1) low 8 bits */
}

poly16_t mull_p16(poly16_t x0, poly16_t x1, poly16_t *x2)
{
#ifdef _OPTIMODE_
#pragma OUT x2
#pragma INTRINSIC
#endif
  poly16_t q=x0, r=0;
  int C,i;
  for (i=0; i<16; i++)
  {
    C = r>>15;
    r = r<<1;
    if (q>>15)
    {
      r = r^x1;
    }
    q = (q<<1)|C;
  }
  *x2 = q;  /* (x0*x1) high 16 bits */
  return r; /* (x0*x1) low 16 bits */
}

poly32_t mull_p32(poly32_t x0, poly32_t x1, poly32_t *x2)
{
#ifdef _OPTIMODE_
#pragma OUT x2
#pragma INTRINSIC
#endif
  poly32_t q=x0, r=0;
  int C,i;
  for (i=0; i<32; i++)
  {
    C = r>>31;
    r = r<<1;
    if (q>>31)
    {
      r = r^x1;
    }
    q = (q<<1)|C;
  }
  *x2 = q;  /* (x0*x1) high 32 bits */
  return r; /* (x0*x1) low 32 bits */
}

/*-------------------------------------------------------------*
 * polynomial multiply accumulate long
 *-------------------------------------------------------------*/
poly8_t mlal_p8(poly8_t x0, poly8_t x1, poly8_t x2, poly8_t x3, poly8_t *x4)
{
#ifdef _OPTIMODE_
#pragma OUT x4
#pragma INTRINSIC
#endif
  poly8_t q=x2, r=0;
  int C,i;
  for (i=0; i<8; i++)
  {
    C = r>>7;
    r = r<<1;
    if (q>>7)
    {
      r = r^x3;
    }
    q = (q<<1)|C;
  }
  *x4 = q^x1;  /* ((x1<<8)+x0)+(x2*x3) high 8 bits */
  return r^x0; /* ((x1<<8)+x0)+(x2*x3) low 8 bits */
}

poly16_t mlal_p16(poly16_t x0, poly16_t x1, poly16_t x2, poly16_t x3, poly16_t *x4)
{
#ifdef _OPTIMODE_
#pragma OUT x4
#pragma INTRINSIC
#endif
  poly16_t q=x2, r=0;
  int C,i;
  for (i=0; i<16; i++)
  {
    C = r>>15;
    r = r<<1;
    if (q>>15)
    {
      r = r^x3;
    }
    q = (q<<1)|C;
  }
  *x4 = q^x1;  /* ((x1<<16)+x0)+(x2*x3) high 16 bits */
  return r^x0; /* ((x1<<16)+x0)+(x2*x3) low 16 bits */
}

poly32_t mlal_p32(poly32_t x0, poly32_t x1, poly32_t x2, poly32_t x3, poly32_t *x4)
{
#ifdef _OPTIMODE_
#pragma OUT x4
#pragma INTRINSIC
#endif
  poly32_t q=x2, r=0;
  int C,i;
  for (i=0; i<32; i++)
  {
    C = r>>31;
    r = r<<1;
    if (q>>31)
    {
      r = r^x3;
    }
    q = (q<<1)|C;
  }
  *x4 = q^x1;  /* ((x1<<32)+x0)+(x2*x3) high 32 bits */
  return r^x0; /* ((x1<<32)+x0)+(x2*x3) low 32 bits */
}

/*-------------------------------------------------------------*
 * polynomial long divide
 *-------------------------------------------------------------*/
poly8_t divl_p8(poly8_t x0, poly8_t x1, poly8_t x2, poly8_t *x3)
{
#ifdef _OPTIMODE_
#pragma OUT x3
#pragma INTRINSIC
#endif
  poly8_t q=x0, r=x1, p=x2;
  int C,i;
  for (i=0; i<8; i++)
  {
    C = r>>7;
    r = r<<1;
    if (C)
    {
      r = r^p;
    }
    if (q>>7)
    {
      r = r^1;
    }
    q = (q<<1)|C;
  }
  *x3 = q;  /* ((x1<<8)+x0) div ((1<<8)+x2) */
  return r; /* ((x1<<8)+x0) mod ((1<<8)+x2) */
}

poly16_t divl_p16(poly16_t x0, poly16_t x1, poly16_t x2, poly16_t *x3)
{
#ifdef _OPTIMODE_
#pragma OUT x3
#pragma INTRINSIC
#endif
  poly16_t q=x0, r=x1, p=x2;
  int C,i;
  for (i=0; i<16; i++)
  {
    C = r>>15;
    r = r<<1;
    if (C)
    {
      r = r^p;
    }
    if (q>>15)
    {
      r = r^1;
    }
    q = (q<<1)|C;
  }
  *x3 = q;  /* ((x1<<16)+x0) div ((1<<16)+x2) */
  return r; /* ((x1<<16)+x0) mod ((1<<16)+x2) */
}

poly32_t divl_p32(poly32_t x0, poly32_t x1, poly32_t x2, poly32_t *x3)
{
#ifdef _OPTIMODE_
#pragma OUT x3
#pragma INTRINSIC
#endif
  poly32_t q=x0, r=x1, p=x2;
  int C,i;
  for (i=0; i<32; i++)
  {
    C = r>>31;
    r = r<<1;
    if (C)
    {
      r = r^p;
    }
    if (q>>31)
    {
      r = r^1;
    }
    q = (q<<1)|C;
  }
  *x3 = q;  /* ((x1<<32)+x0) div ((1<<32)+x2) */
  return r; /* ((x1<<32)+x0) mod ((1<<32)+x2) */
}
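The divide and multiply definitions are tied together by the usual division identity over a field of two elements: numerator = quotient × denominator + remainder, with addition being XOR. The sketch below checks that identity for the 8-bit divide. It is an illustration under assumptions: the helper names are hypothetical, `uint8_t` is assumed for `poly8_t`, and the divide body restates the 8-bit loop of the listing above so that the example is self-contained.

```c
#include <stdint.h>

/* 8-bit polynomial long divide, restating the loop of the listing:
 * numerator (x1:x0), denominator x^8 + p. */
static uint8_t divl_p8_ref(uint8_t x0, uint8_t x1, uint8_t p, uint8_t *quot)
{
    uint8_t q = x0, r = x1;
    for (int i = 0; i < 8; i++) {
        int c = r >> 7;
        r = (uint8_t)(r << 1);
        if (c)      r ^= p;
        if (q >> 7) r ^= 1;
        q = (uint8_t)((q << 1) | c);
    }
    *quot = q;
    return r;
}

/* Carry-less product of an 8-bit quotient and the 9-bit denominator.
 * The degree is at most 7 + 8 = 15, so the result fits in 16 bits. */
static uint16_t clmul_q_by_den(uint8_t q, uint16_t den)
{
    uint32_t acc = 0;
    for (int i = 0; i < 8; i++)
        if ((q >> i) & 1)
            acc ^= (uint32_t)den << i;
    return (uint16_t)acc;
}

/* Returns 1 when quotient*denominator XOR remainder equals the numerator. */
int divl_p8_identity_holds(uint8_t x0, uint8_t x1, uint8_t p)
{
    uint8_t q;
    uint8_t r = divl_p8_ref(x0, x1, p, &q);
    uint16_t num = (uint16_t)(((unsigned)x1 << 8) | x0);
    uint16_t den = (uint16_t)(0x100u | p);   /* implicit x^8 term restored */
    return (uint16_t)(clmul_q_by_den(q, den) ^ r) == num;
}
```

Note that the denominator passed to the multiply must have its implicit leading coefficient restored, since the divide stores only the lower N coefficients.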

Whilst the above described techniques may be performed by hardware executing a sequence of native instructions which include the above-mentioned instructions, it will be appreciated that in alternative embodiments, such instructions may be executed in a virtual machine environment, where the instructions are native to the virtual machine, but the virtual machine is implemented by software executing on hardware having a different native instruction set. The virtual machine environment may provide a full virtual machine environment emulating execution of a full instruction set or may be partial, e.g. only some instructions, including the instructions of the present technique, are trapped by the hardware and emulated by the partial virtual machine.

More specifically, the above-described polynomial processing instructions may be executed as native instructions to the full or partial virtual machine, with the virtual machine together with its underlying hardware platform operating in combination to provide the polynomial processing described above.
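As a rough illustration of the software side of such a virtual machine, the sketch below emulates a single 32-bit polynomial divide against a software register file. Everything about the interface is an assumption made for the example: the `vm_state` structure, the 16-entry register file, and the operand order are hypothetical and do not reflect any instruction encoding given in this description.

```c
#include <stdint.h>

enum { VM_NUM_REGS = 16 };

/* Hypothetical software register file of the virtual machine. */
typedef struct {
    uint32_t reg[VM_NUM_REGS];
} vm_state;

/* Emulate one 32-bit polynomial divide.
 * Numerator:   (reg[n_hi] : reg[n_lo])
 * Denominator: x^32 + reg[den]  (leading coefficient implicit)
 * Results:     quotient -> reg[q_out], remainder -> reg[r_out] */
void vm_emulate_divl_p32(vm_state *vm, unsigned n_lo, unsigned n_hi,
                         unsigned den, unsigned q_out, unsigned r_out)
{
    /* Read all sources before writing, so results may overwrite operands. */
    uint32_t q = vm->reg[n_lo], r = vm->reg[n_hi], p = vm->reg[den];
    for (int i = 0; i < 32; i++) {
        uint32_t c = r >> 31;
        r <<= 1;
        if (c)       r ^= p;
        if (q >> 31) r ^= 1u;
        q = (q << 1) | c;
    }
    vm->reg[q_out] = q;
    vm->reg[r_out] = r;
}
```

In a partial virtual machine, a routine of this kind would be reached only when the underlying hardware traps the unsupported instruction; how that trap is raised and decoded is platform specific and is not modelled here.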

Although illustrative embodiments of the invention have been described in detail herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope and spirit of the invention as defined by the appended claims.

The invention claimed is:
 1. Apparatus for processing data comprising: an instruction decoder responsive to a program instruction to generate one or more control signals; a register bank having a plurality of registers; and processing circuitry coupled to said instruction decoder and said register bank and responsive to said one or more control signals to perform a data processing operation corresponding to said program instruction upon one or more data values stored within said register bank; wherein said instruction decoder is responsive to a polynomial divide instruction as a single instruction to generate one or more control signals that control said processing circuitry to generate as an output stored in said plurality of registers at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_i x^i for N≧i≧0, where c_{N-1} to c_0 are respective bits stored in a register of said register bank and c_N=1 and is not stored within said register, and the degree N of the denominator polynomial is greater than one.
 2. Apparatus for processing data comprising: an instruction decoder responsive to a program instruction to generate one or more control signals; a register bank having a plurality of registers; and processing circuitry coupled to said instruction decoder and said register bank and responsive to said one or more control signals to perform a data processing operation corresponding to said program instruction upon one or more data values stored within said register bank; wherein said instruction decoder is responsive to a polynomial divide instruction as a single instruction to generate one or more control signals that control said processing circuitry to generate as an output stored in said plurality of registers at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_i x^i for N≧i≧0, where c_{N-1} to c_0 are respective bits stored in a register of said register bank and c_N=1 and is not stored within said register, and said register is an N-bit register.
 3. Apparatus as claimed in claim 1, wherein said register bank comprises a plurality of N-bit registers.
 4. Apparatus as claimed in claim 1, wherein a polynomial is represented by one of: (i) a value within an M-bit register with a coefficient c_k for term x^k being bit k of said M-bit register, where (M−1)≧k≧0; and (ii) a value within an M-bit register with a coefficient c_k for term x^k being bit (M−1)−k of said M-bit register, where (M−1)≧k≧0.
 5. Apparatus as claimed in claim 3, wherein said numerator polynomial is represented by a 2N-bit numerator value stored within two of said plurality of N-bit registers.
 6. Apparatus as claimed in claim 1, wherein said numerator polynomial is represented by a 2N-bit numerator value stored within a 2N-bit register of said plurality of registers.
 7. Apparatus as claimed in claim 1, wherein said processing circuitry is controlled by said control signals generated by said instruction decoder in response to said polynomial divide instruction to generate a remainder value representing a remainder polynomial resulting from polynomial division of said numerator polynomial by said denominator polynomial.
 8. Apparatus as claimed in claim 7, wherein said remainder value is an N-bit remainder value stored within an N-bit register of said plurality of registers.
 9. Apparatus as claimed in claim 1, wherein said quotient value is an N-bit quotient value stored within an N-bit register of said plurality of registers.
 10. Apparatus as claimed in claim 1, wherein said polynomial divide instruction is part of scrambler program code executed by said apparatus to scramble a signal to be transmitted using generated quotient values.
 11. Apparatus as claimed in claim 1, wherein said register bank comprises a plurality of general purpose scalar registers used by program instructions other than said polynomial divide instruction.
 12. Apparatus as claimed in claim 1, wherein said instruction decoder is responsive to a polynomial multiply instruction to generate one or more control signals that control said processing circuitry to generate at least a product value representing a product polynomial for a polynomial multiplication over a field of two elements of a first polynomial by a second polynomial.
 13. Apparatus as claimed in claim 1, wherein said polynomial divide instruction is a vector instruction with said denominator value being a scalar value and said quotient value and a numerator value representing said numerator polynomial being vector values.
 14. A method of processing data comprising the steps of: decoding a program instruction to generate one or more control signals; in response to said one or more control signals, performing a data processing operation corresponding to said program instruction upon one or more data values stored within a register bank having a plurality of registers; wherein said decoding is responsive to a polynomial divide instruction as a single instruction to generate one or more control signals that control generation as an output stored in said plurality of registers of at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_i x^i for N≧i≧0, where c_{N-1} to c_0 are respective bits stored in a register of said register bank and c_N=1 and is not stored within said register, and the degree N of the denominator polynomial is greater than one.
 15. A method of processing data comprising the steps of: decoding a program instruction to generate one or more control signals; in response to said one or more control signals, performing a data processing operation corresponding to said program instruction upon one or more data values stored within a register bank having a plurality of registers; wherein said decoding is responsive to a polynomial divide instruction as a single instruction to generate one or more control signals that control generation as an output stored in said plurality of registers of at least a quotient value representing a quotient polynomial for a polynomial division over a field of two elements of a numerator polynomial by a denominator polynomial, said denominator polynomial being an N degree polynomial given by the sum of c_i x^i for N≧i≧0, where c_{N-1} to c_0 are respective bits stored in a register of said register bank and c_N=1 and is not stored within said register, and said register is an N-bit register.
 16. A method as claimed in claim 14, wherein said register bank comprises a plurality of N-bit registers.
 17. A method as claimed in claim 14, wherein a polynomial is represented by one of: (i) a value within an M-bit register with a coefficient c_k for term x^k being bit k of said M-bit register, where (M−1)≧k≧0; and (ii) a value within an M-bit register with a coefficient c_k for term x^k being bit (M−1)−k of said M-bit register, where (M−1)≧k≧0.
 18. A method as claimed in claim 16, wherein said numerator polynomial is represented by a 2N-bit numerator value stored within two of said plurality of N-bit registers.
 19. A method as claimed in claim 14, wherein said numerator polynomial is represented by a 2N-bit numerator value stored within a 2N-bit register of said plurality of registers.
 20. A method as claimed in claim 14, wherein said control signals generated by decoding said polynomial divide instruction control generation of a remainder value representing a remainder polynomial resulting from polynomial division of said numerator polynomial by said denominator polynomial.
 21. A method as claimed in claim 20, wherein said remainder value is an N-bit remainder value stored within an N-bit register of said plurality of registers.
 22. A method as claimed in claim 14, wherein said quotient value is an N-bit quotient value stored within an N-bit register of said plurality of registers.
 23. A method as claimed in claim 14, wherein said polynomial divide instruction is part of scrambler program code executed to scramble a signal to be transmitted using generated quotient values.
 24. A method as claimed in claim 14, wherein said register bank comprises a plurality of general purpose scalar registers used by program instructions other than said polynomial divide instruction.
 25. A method as claimed in claim 14, wherein said decoding is responsive to a polynomial multiply instruction to generate one or more control signals that control generation of at least a product value representing a product polynomial for a polynomial multiplication over a field of two elements of a first polynomial by a second polynomial.
 26. A method as claimed in claim 14, wherein said polynomial divide instruction is a vector instruction with said denominator value being a scalar value and said quotient value and a numerator value representing said numerator polynomial being vector values.
 27. A computer program product comprising a computer program storage medium storing a computer program for controlling an apparatus for processing data in accordance with a method as claimed in claim 14.
 28. Apparatus as claimed in claim 1, wherein the degree N of the denominator polynomial is one of 8, 16, and 32.
 29. A method as claimed in claim 14, wherein the degree N of the denominator polynomial is one of 8, 16, and 32.